Handwork

My policy on handwork is that it is required for Statistics PhD students and MS in Statistics students. It is encouraged (but not required) for MS in Applied Statistics and all other students. (If you are in the latter group, you will not be penalized if you get these wrong.)

Exercises from the book: 23.2, 23.4

You can type your answers and submit within this file or you can do this work on paper and submit a scanned copy (be sure it is legible).

Data analysis

1. Exercise D23.2 (MLM)

The file Snijders.txt contains data on 4106 grade-8 students (who are approximately 11 years old) in 216 primary schools in the Netherlands. The data are used for several examples, somewhat different from the analysis that we will pursue below, by Snijders and Bosker in Multilevel Analysis, 2nd Edition (Sage, 2012).

The data set includes the following variables:

  • school: a (non-consecutive) ID number indicating which school the student attends.

  • iq: the student’s verbal IQ score, ranging from 4 to 18.5 (i.e., not traditionally scaled to a population mean of 100 and standard deviation of 15).

  • test: the student’s score on an end-of-year language test, with scores ranging from 8 to 58.

  • ses: the socioeconomic status of the student’s family, with scores ranging from 10 to 50.

  • class.size: the number of students in the student’s class, ranging from 10 to 42; this variable is constant within schools, apparently reflecting the fact that all of the students in each school were in the same class.

  • meanses: the mean SES in the student’s school, calculated from the data; the original data set included the school-mean SES, but this differed from the values that I computed directly from the data, possibly because it was based on all of the students in the school.

  • meaniq: the mean IQ in the student’s school, calculated (for the same reason) from the data.
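As a quick sanity check, the stated variable ranges can be verified after reading the file (a sketch; it assumes the same whitespace-delimited layout with a header row that read.table() is used with below, and the object name snijders is illustrative):

```r
snijders <- read.table("Datasets/Snijders.txt", header = TRUE)
str(snijders)  # variable names and types
summary(snijders[, c("iq", "test", "ses", "class.size")])  # check the stated ranges
```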

There are some missing data, and I suggest that you begin by removing cases with missing data. How many students are lost when missing data are removed in this manner? Then create and add the following two variables to the data set:

  • SES_c : school-centred SES, computed as the difference between each student’s SES and the mean of his or her school; and

  • IQ_c : school-centred IQ.

  1. Examine scatterplots of students’ test scores by centered SES and centered IQ for each of 20 randomly sampled schools. Do the relationships in the scatterplots seem reasonably linear? Hint: In interpreting these scatterplots, take into account the small number of students in each school, ranging from 4 to 34 in the full data set.
data <- read.table("Datasets/Snijders.txt", header = TRUE)
n_before <- nrow(data)
data <- na.omit(data)
n_before - nrow(data)  # 4106 - 3576 = 530 students lost to missing data
data$SES_c <- with(data, ses - meanses)
data$IQ_c <- with(data, iq - meaniq)

set.seed(1)  # arbitrary seed, for reproducible school sampling
schools <- sample(unique(data$school), 20)
par(mfrow = c(5, 4), mar = c(2.5, 2.5, 1, 1))
for (s in schools) {  # first figure: test score vs. centered SES
  school_data <- data[data$school == s, ]
  plot(school_data$SES_c, school_data$test, main = paste("School", s),
       xlab = "SES_c", ylab = "Test Score")
  abline(lm(test ~ SES_c, data = school_data))
}
for (s in schools) {  # second figure: test score vs. centered IQ
  school_data <- data[data$school == s, ]
  plot(school_data$IQ_c, school_data$test, main = paste("School", s),
       xlab = "IQ_c", ylab = "Test Score")
  abline(lm(test ~ IQ_c, data = school_data))
}

The first figure shows scatterplots of test scores against school-centered SES for each of the 20 randomly sampled schools. The second figure shows the same schools’ test scores against school-centered IQ.

Because each school has a small number of students, it is difficult to draw firm conclusions about linearity. The relationship between test scores and centered SES or IQ is mostly weak: some panels show a clear linear trend, while others show no evident trend or a nonlinear one. Overall, the associations appear to differ from school to school.

  2. Regress the students’ test scores on centred SES and centred IQ within schools for the full dataset – that is, compute a separate regression for each school. Then plot each set of coefficients (starting with the intercepts) against the schools’ mean SES, mean IQ, and class size. Do the coefficients appear to vary systematically by the schools’ characteristics (i.e., by the Level 2 explanatory variables mean SES, mean IQ, and class size)?
library(lmerTest)  # loads lme4 and adds Satterthwaite t-tests to lmer() summaries
model <- lmer(test ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school), data = data)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0515772 (tol = 0.002, component 1)
coefs <- coef(model)$school  # per-school coefficients (partially pooled BLUPs)

# School-level covariates, aligned with the rows of coefs (one row per school)
school_info <- data[match(as.numeric(rownames(coefs)), data$school),
                    c("meanses", "meaniq", "class.size")]

par(mfrow = c(3, 3), mar = c(4, 4, 1, 1))
for (coef_name in c("(Intercept)", "SES_c", "IQ_c")) {
  for (school_var in c("meanses", "meaniq", "class.size")) {
    plot(school_info[[school_var]], coefs[[coef_name]],
         xlab = school_var, ylab = paste("Coefficient:", coef_name))
  }
}

The intercept scatterplots show a positive relationship with both mean SES and mean IQ: schools with higher mean SES and mean IQ tend to have higher test scores, although the strength of this association varies considerably across schools, suggesting that other factors influence test scores as well.

Similarly, the slopes for centred SES and centred IQ show a positive, though weaker, association with mean SES and mean IQ. This suggests that the within-school effects of SES and IQ on test scores are somewhat stronger in schools with higher mean SES and mean IQ, with considerable variation remaining across schools.

In contrast, the scatterplots reveal no clear pattern between the intercepts or slopes and class size. This indicates that class size may not play a significant role in explaining the differences in test scores across schools.
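Note that coef(model)$school returns partially pooled (shrunken) estimates. The exercise literally asks for a separate regression per school, which can be sketched with lme4's lmList() (unpooled per-school OLS fits); comparing the two sets of coefficients would show the effect of shrinkage:

```r
library(lme4)

# One OLS regression per school; schools with too few students to estimate
# all three coefficients yield NA entries
ols_fits <- lmList(test ~ SES_c + IQ_c | school, data = data)
ols_coefs <- coef(ols_fits)  # one row of unpooled estimates per school
head(ols_coefs)
```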

  3. Fit linear mixed-effects models to the Snijders and Bosker data, proceeding as follows:
  • Begin with a one-way random-effects ANOVA of test scores by schools. What proportion of the total variation in test scores among students is between schools (i.e., what is the intra-class correlation)?
# Fit the one-way random-effects ANOVA
model <- lmer(test ~ (1 | school), data = data)

vc <- VarCorr(model)
vc
##  Groups   Name        Std.Dev.
##  school   (Intercept) 4.2743  
##  Residual             7.8909
vc_df <- as.data.frame(vc)
icc <- vc_df$vcov[1] / sum(vc_df$vcov)  # between-school variance / total variance
icc
## [1] 0.2268526

The one-way random-effects ANOVA yields an intra-class correlation of 0.227, meaning that about 22.7% of the total variation in test scores among students is between schools. A substantial share of the variation in test scores therefore lies at the school level rather than among students within schools.

  • Fit a random-coefficients regression of test scores on the students’ centered SES and centered IQ. Initially include random effects for the intercept and both explanatory variables. Test whether each of these random effects is needed, and eliminate from the model those that are not (if any are not). How, if at all, are test scores related to the explanatory variables? Note: You may obtain a convergence warning in fitting one or more of the null models that remove variance and covariance components; this warning should not prevent you from performing the likelihood-ratio test for the corresponding random effects.
model <- lmer(test ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school), data = data)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0515772 (tol = 0.002, component 1)
summary(model)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: test ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school)
##    Data: data
## 
## REML criterion at convergence: 23589.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2115 -0.6340  0.0674  0.7038  2.9009 
## 
## Random effects:
##  Groups   Name        Variance  Std.Dev. Corr       
##  school   (Intercept) 20.828505 4.56383             
##           SES_c        0.001372 0.03704  -0.02      
##           IQ_c         0.276992 0.52630  -0.56 -0.81
##  Residual             37.042593 6.08626             
## Number of obs: 3576, groups:  school, 199
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)  40.84201    0.34212 186.26466  119.38   <2e-16 ***
## SES_c         0.17334    0.01220 417.52503   14.21   <2e-16 ***
## IQ_c          2.23828    0.06997 162.24958   31.99   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##       (Intr) SES_c 
## SES_c -0.004       
## IQ_c  -0.293 -0.329
## optimizer (nloptwrap) convergence code: 0 (OK)
## Model failed to converge with max|grad| = 0.0515772 (tol = 0.002, component 1)

We can test whether each of the random effects is needed using likelihood-ratio tests. We first compare against a model with a random intercept only (no random slopes):

null_model <- lmer(test ~ SES_c + IQ_c + (1 | school), data = data)
anova(null_model, model)
## refitting model(s) with ML (instead of REML)
## Data: data
## Models:
## null_model: test ~ SES_c + IQ_c + (1 | school)
## model: test ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school)
##            npar   AIC   BIC logLik deviance  Chisq Df Pr(>Chisq)    
## null_model    5 23620 23651 -11805    23610                         
## model        10 23598 23660 -11789    23578 31.579  5  7.198e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Based on the likelihood-ratio test, the model with random slopes for centered SES and centered IQ fits significantly better than the model with only a random intercept, so at least one of the random slopes is needed. We next test each random slope separately.

null_model <- lmer(test ~ SES_c + IQ_c + (1 + IQ_c | school), data = data)
anova(null_model, model)
## refitting model(s) with ML (instead of REML)
## Data: data
## Models:
## null_model: test ~ SES_c + IQ_c + (1 + IQ_c | school)
## model: test ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school)
##            npar   AIC   BIC logLik deviance Chisq Df Pr(>Chisq)
## null_model    7 23595 23638 -11790    23581                    
## model        10 23598 23660 -11789    23578 2.616  3     0.4547
null_model <- lmer(test ~ SES_c + IQ_c + (1 + SES_c | school), data = data)
## boundary (singular) fit: see help('isSingular')
anova(null_model, model)
## refitting model(s) with ML (instead of REML)
## Data: data
## Models:
## null_model: test ~ SES_c + IQ_c + (1 + SES_c | school)
## model: test ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school)
##            npar   AIC   BIC logLik deviance  Chisq Df Pr(>Chisq)    
## null_model    7 23622 23665 -11804    23608                         
## model        10 23598 23660 -11789    23578 29.876  3  1.466e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

According to the likelihood-ratio tests, dropping the random slope for centered SES (while retaining the random slope for IQ) does not significantly worsen the fit (p = 0.4547), so the SES_c random slope can be removed. By contrast, dropping the random slope for centered IQ significantly worsens the fit (p = 1.5e-06), so that random slope is retained.
The final random-coefficients model is:

final_model <- lmer(test ~ SES_c + IQ_c + (1 + IQ_c | school), data = data)
summary(final_model)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: test ~ SES_c + IQ_c + (1 + IQ_c | school)
##    Data: data
## 
## REML criterion at convergence: 23591.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2087 -0.6285  0.0664  0.7032  2.8980 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  school   (Intercept) 20.8733  4.5687        
##           IQ_c         0.2322  0.4819   -0.62
##  Residual             37.1675  6.0965        
## Number of obs: 3576, groups:  school, 199
## 
## Fixed effects:
##              Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept) 4.084e+01  3.425e-01 1.856e+02  119.25   <2e-16 ***
## SES_c       1.736e-01  1.186e-02 3.360e+03   14.64   <2e-16 ***
## IQ_c        2.237e+00  6.812e-02 1.757e+02   32.84   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##       (Intr) SES_c 
## SES_c  0.000       
## IQ_c  -0.305 -0.233

The results indicate that both centered SES and centered IQ are positively associated with test scores. The random intercepts and random slopes for centered IQ indicate that both average performance and the effect of IQ on test scores vary among schools.

  • Introduce mean school SES, mean school IQ, and class size as Level 2 explanatory variables, but only for the Level 1 coefficients that were found to vary significantly among schools in the random-coefficients model. Hint: Recall that modeling variation in Level 1 coefficients by Level 2 explanatory variables implies the inclusion of cross-level interactions in the model; and don’t forget that the intercepts are Level 1 coefficients that may depend on Level 2 explanatory variables. It may well help to write down the mixed-effects model first in hierarchical form and then in Laird-Ware form. Test whether the random effects that you retained in the random-coefficients model are still required now that there are Level 2 predictors in the model. Note: Again, you may obtain a convergence warning.
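Following the hint, with random effects retained only for the intercepts and the IQ_c slopes (as found above), the hierarchical form of the coefficients-as-outcomes model is

\[ \text{Level 1:} \quad Y_{ij} = \alpha_{0j} + \alpha_{1j}\,\text{SES\_c}_{ij} + \alpha_{2j}\,\text{IQ\_c}_{ij} + \epsilon_{ij} \]

\[ \text{Level 2:} \quad \alpha_{0j} = \gamma_{00} + \gamma_{01}\,\text{meanses}_j + \gamma_{02}\,\text{meaniq}_j + \gamma_{03}\,\text{class.size}_j + u_{0j}, \qquad \alpha_{1j} = \gamma_{10}, \]

\[ \alpha_{2j} = \gamma_{20} + \gamma_{21}\,\text{meanses}_j + \gamma_{22}\,\text{meaniq}_j + \gamma_{23}\,\text{class.size}_j + u_{2j} \]

Substituting the Level 2 equations into Level 1 yields the Laird-Ware form, in which the products IQ_c × meanses, IQ_c × meaniq, and IQ_c × class.size enter as cross-level interaction fixed effects; this is exactly the fixed-effect structure of the lmer() call that follows.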
# Note: (IQ_c | school) already includes a random intercept, so the extra
# (1 | school) term adds a second, redundant intercept variance (one source
# of the singular-fit warning below)
m2 <- lmer(test ~ SES_c + IQ_c + meanses + meaniq + class.size + IQ_c:meanses + IQ_c:meaniq + IQ_c * class.size +
            (IQ_c | school) + (1 | school), data = data)
## boundary (singular) fit: see help('isSingular')
## Warning: Model failed to converge with 1 negative eigenvalue: -6.2e-01
summary(m2)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: test ~ SES_c + IQ_c + meanses + meaniq + class.size + IQ_c:meanses +  
##     IQ_c:meaniq + IQ_c * class.size + (IQ_c | school) + (1 |      school)
##    Data: data
## 
## REML criterion at convergence: 23462.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2716 -0.6225  0.0748  0.7128  2.8980 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  school   (Intercept)  3.2530  1.8036        
##           IQ_c         0.2269  0.4763   -1.00
##  school.1 (Intercept)  5.6541  2.3778        
##  Residual             37.0582  6.0875        
## Number of obs: 3576, groups:  school, 199
## 
## Fixed effects:
##                   Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)     -2.493e+00  3.312e+00  2.228e+02  -0.753  0.45244    
## SES_c            1.744e-01  1.186e-02  3.367e+03  14.707  < 2e-16 ***
## IQ_c             3.071e+00  9.560e-01  1.980e+02   3.213  0.00154 ** 
## meanses          8.957e-02  4.582e-02  1.860e+02   1.955  0.05210 .  
## meaniq           3.515e+00  3.099e-01  2.139e+02  11.344  < 2e-16 ***
## class.size      -2.043e-02  4.241e-02  2.012e+02  -0.482  0.63044    
## IQ_c:meanses    -2.140e-02  1.255e-02  1.424e+02  -1.706  0.09020 .  
## IQ_c:meaniq     -3.352e-02  9.024e-02  1.974e+02  -0.371  0.71069    
## IQ_c:class.size  5.539e-03  1.252e-02  1.805e+02   0.442  0.65880    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) SES_c  IQ_c   meanss meaniq clss.s IQ_c:mns IQ_c:mnq
## SES_c        0.000                                                     
## IQ_c        -0.255 -0.026                                              
## meanses      0.251  0.002 -0.071                                       
## meaniq      -0.911 -0.001  0.233 -0.510                                
## class.size  -0.253  0.000  0.072 -0.201  0.000                         
## IQ_c:meanss -0.072 -0.048  0.301 -0.293  0.149  0.058                  
## IQ_c:meaniq  0.230  0.022 -0.911  0.142 -0.256 -0.003 -0.542           
## IQ_c:clss.s  0.070  0.006 -0.259  0.054 -0.002 -0.267 -0.166   -0.023  
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

From the results, class.size and its interaction with IQ_c are not statistically significant, so we drop them next.

  • Compute tests of the various main effects and interactions in the coefficients-as-outcomes model. Then simplify the model by removing any fixed-effects terms that are nonsignificant. Finally, interpret the results obtained for the simplified model. If your final model includes interactions, you may wish to construct effect displays to visualize the interactions.
m3 <- lmer(test ~ SES_c + IQ_c + meanses + meaniq + IQ_c:meanses + IQ_c:meaniq + (IQ_c | school) + (1 | school), data = data)
## boundary (singular) fit: see help('isSingular')
summary(m3)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: test ~ SES_c + IQ_c + meanses + meaniq + IQ_c:meanses + IQ_c:meaniq +  
##     (IQ_c | school) + (1 | school)
##    Data: data
## 
## REML criterion at convergence: 23451.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2756 -0.6251  0.0756  0.7144  2.8955 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  school   (Intercept)  3.2982  1.8161        
##           IQ_c         0.2232  0.4724   -1.00
##  school.1 (Intercept)  5.5659  2.3592        
##  Residual             37.0583  6.0876        
## Number of obs: 3576, groups:  school, 199
## 
## Fixed effects:
##                Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)    -2.89462    3.19831  225.03966  -0.905  0.36641    
## SES_c           0.17438    0.01186 3367.87255  14.707  < 2e-16 ***
## IQ_c            3.18469    0.92113  194.45600   3.457  0.00067 ***
## meanses         0.08513    0.04479  187.89311   1.901  0.05889 .  
## meaniq          3.51520    0.30932  215.06410  11.364  < 2e-16 ***
## IQ_c:meanses   -0.02040    0.01234  145.72777  -1.654  0.10037    
## IQ_c:meaniq    -0.03323    0.09001  197.78297  -0.369  0.71233    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) SES_c  IQ_c   meanss meaniq IQ_c:mns
## SES_c        0.000                                     
## IQ_c        -0.254 -0.025                              
## meanses      0.211  0.002 -0.060                       
## meaniq      -0.942 -0.001  0.241 -0.521                
## IQ_c:meanss -0.060 -0.048  0.271 -0.293  0.151         
## IQ_c:meaniq  0.238  0.022 -0.950  0.145 -0.257 -0.553  
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

Based on the results, meanses is at best marginally significant (p = 0.059) and both cross-level interactions are nonsignificant (IQ_c:meanses, p = 0.100; IQ_c:meaniq, p = 0.712), so we simplify the model by removing these terms. The final model becomes:

final_model <- lmer(test ~ SES_c + IQ_c + meaniq + (IQ_c | school) + (1 | school), data = data)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues
## Warning: Model failed to converge with 1 negative eigenvalue: -6.7e-02
summary(final_model)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: test ~ SES_c + IQ_c + meaniq + (IQ_c | school) + (1 | school)
##    Data: data
## 
## REML criterion at convergence: 23443.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2629 -0.6382  0.0724  0.7097  2.9090 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  school   (Intercept)  4.0795  2.0198        
##           IQ_c         0.2438  0.4938   -0.91
##  school.1 (Intercept)  4.9194  2.2180        
##  Residual             37.0409  6.0861        
## Number of obs: 3576, groups:  school, 199
## 
## Fixed effects:
##               Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)   -2.93120    3.03102  232.08286  -0.967    0.335    
## SES_c          0.17358    0.01184 3370.41414  14.655   <2e-16 ***
## IQ_c           2.23712    0.06864  173.99910  32.593   <2e-16 ***
## meaniq         3.71562    0.25596  229.85654  14.517   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##        (Intr) SES_c  IQ_c  
## SES_c   0.001              
## IQ_c   -0.023 -0.230       
## meaniq -0.997 -0.001  0.000
## optimizer (nloptwrap) convergence code: 0 (OK)
## unable to evaluate scaled gradient
## Model failed to converge: degenerate  Hessian with 1 negative eigenvalues

The mixed-effects model summary presents the relationship between test scores and three predictors: centered SES (SES_c), centered IQ (IQ_c), and school mean IQ (meaniq), with random intercepts and random IQ_c slopes across schools. All three predictors are statistically significantly associated with test scores (p < .001).
Specifically, holding the other predictors constant, a one-unit increase in SES_c is associated with a 0.174-point increase in test scores, a one-unit increase in IQ_c with a 2.237-point increase, and a one-unit increase in meaniq with a 3.716-point increase. The intercept (-2.93) is the estimated test score when all predictors are zero, which is an extrapolation, since meaniq is never near zero in these data.
Moreover, the random-effects estimates indicate variation in both intercepts and IQ_c slopes across schools. Because the model contains two redundant random-intercept terms, the total intercept variance is split between them (4.08 + 4.92 ≈ 9.0), indicating substantial school-level variability in test scores; the random-slope variance for IQ_c (0.24) indicates that the relationship between IQ_c and test scores also varies across schools.

2. Exercise D23.2 (Binary version)

Repeat Problem (1) but now, instead of using test as the outcome, you will use a dichotomized version. To do so, create a new variable called high_pass that indicates if a student receives a score of 90% or above.

Pay particular attention to interpretation and to how your results compare with those based on the continuous version. Are your results similar or do they differ? Explain why or why not.

data$high_pass <- ifelse(data$test > 52, 1, 0)  # 90% of the maximum score (58) is 52.2
# Following the first part, the ICC is first computed from a linear (Gaussian)
# model fit to the 0/1 outcome; lmer() is called directly rather than glmer()
# with the (deprecated) default gaussian family
model <- lmer(high_pass ~ 1 + (1 | school), data = data)
vc <- VarCorr(model)
vc
##  Groups   Name        Std.Dev.
##  school   (Intercept) 0.066921
##  Residual             0.273786
vc_df <- as.data.frame(vc)
icc <- vc_df$vcov[1] / sum(vc_df$vcov)
icc
## [1] 0.05637702

The intra-class correlation of 0.056 suggests that around 5.6% of the total variability in high_pass among students is attributable to differences between schools (computed here from a linear model on the 0/1 outcome). Although this proportion is relatively small, we still fit mixed-effects models to investigate potential associations.
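Because the outcome is binary, the ICC can instead be computed on the latent logistic scale, where the level-1 residual variance is fixed at \(\pi^2/3\). A sketch of that alternative:

```r
library(lme4)

# Logistic random-intercept model for the binary outcome
logit_null <- glmer(high_pass ~ 1 + (1 | school), family = binomial, data = data)

# Latent-scale ICC: between-school variance / (between-school + pi^2/3)
v_school <- as.data.frame(VarCorr(logit_null))$vcov[1]
v_school / (v_school + pi^2 / 3)
```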

model <- glmer(high_pass ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school), family = binomial, data = data)
## boundary (singular) fit: see help('isSingular')
library(car)  # for the S() summary function
S(model)
## Generalized linear mixed model fit by ML
## Call: glmer(formula = high_pass ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school),
##             data = data, family = binomial)
## 
## Estimates of Fixed Effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -3.37912    0.16423 -20.576  < 2e-16 ***
## SES_c        0.05603    0.01078   5.195 2.04e-07 ***
## IQ_c         0.56439    0.05242  10.766  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Exponentiated Fixed Effects and Confidence Bounds:
##               Estimate      2.5 %     97.5 %
## (Intercept) 0.03407746 0.02469884 0.04701732
## SES_c       1.05762521 1.03550636 1.08021652
## IQ_c        1.75837676 1.58668601 1.94864567
## 
## Estimates of Random Effects (Covariance Components):
##  Groups Name        Std.Dev. Corr       
##  school (Intercept) 1.08639             
##         SES_c       0.02568  -0.10      
##         IQ_c        0.01125  -0.84 -0.46
## 
## Number of obs: 3576, groups:  school, 199
## 
##  logLik      df     AIC     BIC 
## -868.30       9 1754.59 1810.23

After including the Level 1 variables, both SES_c and IQ_c are highly significant predictors. As before, we test whether each of the random effects is needed using likelihood-ratio tests.

null_model <- glmer(high_pass ~ SES_c + IQ_c + (1 | school), family = binomial, data = data)
anova(model, null_model)
## Data: data
## Models:
## null_model: high_pass ~ SES_c + IQ_c + (1 | school)
## model: high_pass ~ SES_c + IQ_c + (1 + SES_c + IQ_c | school)
##            npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## null_model    4 1745.2 1770.0 -868.63   1737.2                     
## model         9 1754.6 1810.2 -868.30   1736.6 0.6623  5      0.985

Thus, the random slopes for SES_c and IQ_c are not needed (p = 0.985); a random intercept alone suffices. Next, we include the Level 2 covariates.

with_leve_two <- glmer(high_pass ~ SES_c + IQ_c + meanses + meaniq + 
                              class.size +  (1 | school), family = binomial, data = data)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00541017 (tol = 0.002, component 1)
null_model <- glm(high_pass ~ SES_c + IQ_c + meanses + meaniq +
                    class.size, family = binomial, data = data)

anova(with_leve_two, null_model)
## Data: data
## Models:
## null_model: high_pass ~ SES_c + IQ_c + meanses + meaniq + class.size
## with_leve_two: high_pass ~ SES_c + IQ_c + meanses + meaniq + class.size + (1 | school)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## null_model       6 1756.3 1793.4 -872.15   1744.3                         
## with_leve_two    7 1706.1 1749.4 -846.06   1692.1 52.179  1  5.065e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We still need the random intercepts.

S(with_leve_two)
## Generalized linear mixed model fit by ML
## Call: glmer(formula = high_pass ~ SES_c + IQ_c + meanses + meaniq + class.size
##             + (1 | school), data = data, family = binomial)
## 
## Estimates of Fixed Effects:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -13.609515   1.703749  -7.988 1.37e-15 ***
## SES_c         0.055421   0.007517   7.373 1.67e-13 ***
## IQ_c          0.561610   0.042839  13.110  < 2e-16 ***
## meanses       0.009025   0.018760   0.481    0.630    
## meaniq        0.816292   0.146813   5.560 2.70e-08 ***
## class.size    0.012461   0.018346   0.679    0.497    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Exponentiated Fixed Effects and Confidence Bounds:
##                 Estimate        2.5 %       97.5 %
## (Intercept) 1.228748e-06 4.357419e-08 3.464943e-05
## SES_c       1.056986e+00 1.041527e+00 1.072674e+00
## IQ_c        1.753493e+00 1.612276e+00 1.907078e+00
## meanses     1.009066e+00 9.726375e-01 1.046859e+00
## meaniq      2.262096e+00 1.696459e+00 3.016328e+00
## class.size  1.012539e+00 9.767775e-01 1.049610e+00
## 
## Estimates of Random Effects (Covariance Components):
##  Groups Name        Std.Dev.
##  school (Intercept) 0.8738  
## 
## Number of obs: 3576, groups:  school, 199
## 
##  logLik      df     AIC     BIC 
## -846.06       7 1706.12 1749.40

We further test whether meanses and class.size are needed.

reduced <- glmer(high_pass ~ SES_c + IQ_c + meaniq +
                   (1 | school), family = binomial, data = data)
anova(reduced, with_leve_two)
## Data: data
## Models:
## reduced: high_pass ~ SES_c + IQ_c + meaniq + (1 | school)
## with_leve_two: high_pass ~ SES_c + IQ_c + meanses + meaniq + class.size + (1 | school)
##               npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
## reduced          5 1703.0 1733.9 -846.49   1693.0                     
## with_leve_two    7 1706.1 1749.4 -846.06   1692.1 0.8586  2      0.651

Thus, we only need SES_c, IQ_c, meaniq, and random intercepts in the model. Compared with the continuous analysis, the final binary model does not require a random slope for IQ_c.

final_model <- glmer(high_pass ~ SES_c + IQ_c + meaniq +
                       (1 | school), family = binomial, data = data)
S(final_model)
## Generalized linear mixed model fit by ML
## Call: glmer(formula = high_pass ~ SES_c + IQ_c + meaniq + (1 | school), data =
##             data, family = binomial)
## 
## Estimates of Fixed Effects:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -13.449776   1.603282  -8.389  < 2e-16 ***
## SES_c         0.055602   0.007513   7.401 1.35e-13 ***
## IQ_c          0.561565   0.042819  13.115  < 2e-16 ***
## meaniq        0.851592   0.131165   6.493 8.44e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Exponentiated Fixed Effects and Confidence Bounds:
##                 Estimate        2.5 %       97.5 %
## (Intercept) 1.441572e-06 6.224727e-08 0.0000333851
## SES_c       1.057177e+00 1.041724e+00 1.0728589455
## IQ_c        1.753414e+00 1.612266e+00 1.9069193277
## meaniq      2.343374e+00 1.812150e+00 3.0303243617
## 
## Estimates of Random Effects (Covariance Components):
##  Groups Name        Std.Dev.
##  school (Intercept) 0.872   
## 
## Number of obs: 3576, groups:  school, 199
## 
##  logLik      df     AIC     BIC 
## -846.49       5 1702.98 1733.89

The fixed-effect estimates show that all three predictors are significantly associated with high_pass (p < .001). Specifically, a one-unit increase in SES_c multiplies the odds of a high pass by 1.057, a one-unit increase in IQ_c multiplies them by 1.753, and a one-unit increase in meaniq multiplies them by 2.343. The intercept (-13.45) is the log-odds of a high pass when all predictors are zero.

The estimate for random effects indicates that there is a significant variation in the intercept of high_pass across schools. The variance of the random intercept is 0.76, which suggests that there is a considerable variation in high_pass across schools that is not explained by the fixed effects.

Overall, the model indicates that SES, IQ, and school-mean IQ are important predictors of high_pass, and that there is substantial variation in high_pass across schools beyond what these predictors explain. However, one should interpret the results with some caution, since the model's convergence may not be stable.
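The odds ratios quoted above are simply the exponentiated fixed-effect estimates; the conversion can be verified directly (coefficients copied from the output):

```r
# Log-odds coefficients from the fitted model, exponentiated to odds ratios
beta <- c(SES_c = 0.055602, IQ_c = 0.561565, meaniq = 0.851592)
round(exp(beta), 3)
##  SES_c   IQ_c meaniq 
##  1.057  1.753  2.343
```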

3. Exercise D23.3 (Longitudinal)

Laird and Fitzmaurice (“Longitudinal Data Modeling,” in Scott, Simonoff, and Marx, eds., The SAGE Handbook of Multilevel Modeling, Sage, 2013) analyze longitudinal data from the MIT Growth and Development Study on the change over time of percent body fat in 162 girls before and after menarche (age at first menstruation). The data are in the file Phillips.txt:

  • subject: subject ID number, 1–162.

  • age: age (in years) at the time of measurement; the girls are measured at different ages, and although the measurements are approximately taken annually, the ages are not generally whole numbers.

  • menarche: age at menarche (constant within subjects).

  • age.adjusted: age − age at menarch.

  • body.fat: percentage body fat at the time of measurement.

Laird and Fitzmaurice fit a linear mixed-effects model to the data,

\[ Y_{ij} = \beta_1 + \beta_2 t_{ij-} + \beta_3 t_{ij+} + \delta_{1i} + \delta_{2i} t_{ij-} + \delta_{3i} t_{ij+} + \epsilon_{ij} \]

where

\(Y_{ij}\) is the body-fat measurement for girl \(i\) on occasion \(j\);

\(t_{ij-}\) is adjusted age prior to menarche and 0 thereafter;

\(t_{ij+}\) is adjusted age after menarche and 0 before;

\(\beta_1, \beta_2, \beta_3\) are fixed effects; and

\(\delta_{1i}, \delta_{2i}, \delta_{3i}\) are subject-specific random effects.

  1. Examine the data by plotting body fat versus adjusted age for all of the girls simultaneously; following Laird and Fitzmaurice, add a lowess smooth to the scatterplot. Now randomly select a subset (say, 30) of the girls and plot body fat versus adjusted age separately for each of the selected girls. What can you say about the apparent relationship between body fat and age before and after menarche? Is Laird and Fitzmaurice’s model reasonable given your exploration of the data? Explain what each fixed-effect and random-effect coefficient in the model represents.
phillips <- read.table("Datasets/Phillips.txt", header=TRUE)
# All girls simultaneously: individual trajectories plus a single lowess smooth
ggplot(phillips, aes(x = age.adjusted, y = body.fat)) +
  geom_line(aes(group = subject), alpha = 0.3) +
  geom_smooth(method = "loess", se = FALSE)
## `geom_smooth()` using formula = 'y ~ x'

# Randomly select 30 girls and plot each one separately
set.seed(1)  # for reproducibility
my_sample <- sample(unique(phillips$subject), 30)
for (girl in my_sample) {
  temp_data <- phillips[phillips$subject == girl, ]
  plot(body.fat ~ age.adjusted, data = temp_data, type = "l",
       main = paste("Subject", girl))
}
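Alternatively, assuming `phillips` and `my_sample` are defined as above, the 30 sampled girls can be shown in a single faceted figure with common axes:

```r
# One panel per sampled girl; shared axes make trajectories easier to compare
ggplot(subset(phillips, subject %in% my_sample),
       aes(x = age.adjusted, y = body.fat)) +
  geom_line() +
  facet_wrap(~ subject)
```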

The individual trajectories suggest that percent body fat is roughly stable, or changes slowly, before menarche and then increases steadily afterward, so a piecewise-linear model with a change in slope at menarche is reasonable. Laird and Fitzmaurice's model includes subject-specific random effects for the intercept and for the slopes of adjusted age before and after menarche; this is sensible because each girl is likely to have her own baseline level and her own pre- and post-menarche trajectories.
The two distinct age variables allow separate fixed-effect slopes before and after menarche, giving the model the flexibility to capture the change in the relationship at menarche: \(\beta_1\) is the average body fat at menarche, \(\beta_2\) and \(\beta_3\) are the average slopes before and after menarche, and \(\delta_{1i}, \delta_{2i}, \delta_{3i}\) are girl \(i\)'s deviations from those averages.

  2. Fit the mixed-effects model as specified by Laird and Fitzmaurice. What do you conclude? Consider the possibility of dropping each of the random effects from the model.
phillips$if_menarche <- ifelse(phillips$age.adjusted < 0, 0, 1)
model <- lmer(body.fat ~ age.adjusted * if_menarche + (1 + age.adjusted + if_menarche | subject), data = phillips)
S(model)
## Linear mixed model fit by REML 
## Call: lmer(formula = body.fat ~ age.adjusted * if_menarche + (1 + age.adjusted
##            + if_menarche | subject), data = phillips)
## 
## Estimates of Fixed Effects:
##                            Estimate Std. Error         df    t value   Pr(>|t|)
## (Intercept)               2.005e+01  5.774e-01  1.874e+02  3.472e+01  1.486e-83
## age.adjusted             -1.350e-01  1.495e-01  2.613e+02 -9.031e-01  3.673e-01
## if_menarche               2.587e+00  4.366e-01  1.564e+02  5.926e+00  1.916e-08
## age.adjusted:if_menarche  2.142e+00  1.794e-01  7.038e+02  1.194e+01  4.846e-30
## 
## Estimates of Random Effects (Covariance Components):
##  Groups   Name         Std.Dev. Corr       
##  subject  (Intercept)  6.5831              
##           age.adjusted 0.8562    0.02      
##           if_menarche  3.1583   -0.37 -0.51
##  Residual              2.9805              
## 
## Number of obs: 1049, groups:  subject, 162
## 
##   logLik       df      AIC      BIC 
## -3012.21       11  6046.42  6100.93

The output presents the fixed-effect estimates (the intercept and the slopes for adjusted age before and after menarche) together with the subject-specific random effects for the intercept and slopes. Based on the significant p-values for the fixed effects and the relatively low residual standard deviation, the model appears to fit the data well. The coefficients indicate a significant positive relationship between body fat and adjusted age after menarche, but no such association before menarche. (Note that the `age.adjusted * if_menarche` parameterization also allows a jump in expected body fat at menarche, a small departure from Laird and Fitzmaurice's specification, which is continuous there.) To evaluate whether each random effect can be dropped, we fit two reduced models and compare each to the full model with likelihood-ratio tests.

model_no_int <- lmer(body.fat ~ age.adjusted * if_menarche + (age.adjusted | subject), data = phillips)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00537051 (tol = 0.002, component 1)
anova(model_no_int, model)
## refitting model(s) with ML (instead of REML)
## Data: phillips
## Models:
## model_no_int: body.fat ~ age.adjusted * if_menarche + (age.adjusted | subject)
## model: body.fat ~ age.adjusted * if_menarche + (1 + age.adjusted + if_menarche | subject)
##              npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)   
## model_no_int    8 6049.7 6089.4 -3016.9   6033.7                        
## model          11 6042.1 6096.6 -3010.0   6020.1 13.689  3   0.003361 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
model_no_slope <- lmer(body.fat ~ age.adjusted * if_menarche + (if_menarche  | subject), data = phillips)
anova(model_no_slope, model)
## refitting model(s) with ML (instead of REML)
## Data: phillips
## Models:
## model_no_slope: body.fat ~ age.adjusted * if_menarche + (if_menarche | subject)
## model: body.fat ~ age.adjusted * if_menarche + (1 + age.adjusted + if_menarche | subject)
##                npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## model_no_slope    8 6063.7 6103.4 -3023.9   6047.7                         
## model            11 6042.1 6096.6 -3010.0   6020.1 27.664  3  4.272e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The likelihood-ratio tests indicate that both random-effects terms should be retained: dropping either one significantly worsens the fit. (These tests are, if anything, conservative because the variance parameters lie on the boundary of the parameter space, so the evidence for keeping the random effects is even stronger than the p-values suggest.) The model specified by Laird and Fitzmaurice therefore appears appropriate for these data.
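As an aside, the `if_menarche` dummy in the fitted model permits a jump in expected body fat at menarche; a sketch of a parameterization matching Laird and Fitzmaurice's specification exactly (piecewise slopes, continuous at menarche) is below, assuming `phillips` is loaded and `lme4` attached:

```r
# t_pre  = adjusted age before menarche (0 afterward)
# t_post = adjusted age after menarche  (0 beforehand)
phillips$t_pre  <- pmin(phillips$age.adjusted, 0)
phillips$t_post <- pmax(phillips$age.adjusted, 0)

# Random intercept and random pre-/post-menarche slopes, as in the equation above
model_lf <- lmer(body.fat ~ t_pre + t_post + (1 + t_pre + t_post | subject),
                 data = phillips)
```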